Modelling speech line spectral frequencies with Dirichlet mixture models

Authors

  • Zhanyu Ma
  • Arne Leijon
Abstract

In this paper, we model the underlying probability density function (PDF) of the speech line spectral frequency (LSF) parameters with a Dirichlet mixture model (DMM). The LSF parameters have two special features: 1) they have a bounded range; 2) they are in increasing order. By transforming the LSF parameters to the ΔLSF parameters, the DMM can be used to model the ΔLSF parameters and take advantage of both features. The distortion-rate (D-R) relation is derived for the Dirichlet distribution under the high-rate assumption. A bit allocation strategy for the DMM is also proposed. In modelling the LSF parameters extracted from the TIMIT database, the DMM performs better than the Gaussian mixture model in terms of D-R relation, likelihood and model complexity. Since modelling is the essential prerequisite step in PDF-optimized vector quantizer design, better modelling results indicate superior quantization performance.
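
The LSF-to-ΔLSF transformation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the LSFs are given in radians in the open interval (0, π) and that the ΔLSF vector consists of the successive gaps scaled by 1/π, with the final gap up to π appended so that the entries are positive and sum to one. The abstract does not spell out the exact transform, so this convention and the function names (`lsf_to_delta_lsf`, `delta_lsf_to_lsf`, `dirichlet_logpdf`, `dmm_logpdf`) are illustrative assumptions, not the authors' implementation.

```python
import math
import numpy as np

def lsf_to_delta_lsf(lsf):
    """Map ordered LSFs in (0, pi) to K+1 positive gaps that sum to 1.

    Assumed convention: gaps between 0, the LSFs, and pi, divided by pi.
    """
    lsf = np.asarray(lsf, dtype=float)
    padded = np.concatenate(([0.0], lsf, [math.pi]))
    return np.diff(padded) / math.pi

def delta_lsf_to_lsf(delta):
    """Invert the mapping: cumulative sum of all but the last gap, times pi."""
    delta = np.asarray(delta, dtype=float)
    return math.pi * np.cumsum(delta[:-1])

def dirichlet_logpdf(x, alpha):
    """Log-density of a Dirichlet distribution at a point x on the simplex."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    log_norm = math.lgamma(alpha.sum()) - sum(math.lgamma(a) for a in alpha)
    return log_norm + float(np.sum((alpha - 1.0) * np.log(x)))

def dmm_logpdf(x, weights, alphas):
    """Log-density of a Dirichlet mixture, using log-sum-exp for stability."""
    comps = [math.log(w) + dirichlet_logpdf(x, a)
             for w, a in zip(weights, alphas)]
    m = max(comps)
    return m + math.log(sum(math.exp(c - m) for c in comps))

# Hypothetical 5-dimensional LSF vector (values chosen only for illustration).
lsf = [0.3, 0.7, 1.2, 2.0, 2.9]
delta = lsf_to_delta_lsf(lsf)
```

Because the bounded range and the ordering of the LSFs turn into positivity and a unit sum after this mapping, the ΔLSF vector lies on the support of a Dirichlet distribution, which is why a DMM can be fitted to it directly. The mixture weights and `alpha` parameters would come from a fitting procedure that is not shown here.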

Similar resources

Super-Dirichlet Mixture Models Using Differential Line Spectral Frequencies for Text-Independent Speaker Identification

A new text-independent speaker identification (SI) system is proposed. This system utilizes the line spectral frequencies (LSFs) as an alternative feature set for capturing the speaker characteristics. The boundary and ordering properties of the LSFs are considered and the LSFs are transformed to the differential LSF (DLSF) space. Since the dynamic information is useful for speaker recognition, we ...

Text-to-speech voice adaptation from sparse training data

Voice adaptation describes the process of converting the output of a text-to-speech synthesizer voice to sound like a different voice after a training process in which only a small amount of the desired target speaker’s speech is seen. We employ a locally linear conversion function based on Gaussian mixture models to map bark-scaled line spectral frequencies. We compare performance for three di...

Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering

Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...

Spectral normalization employing hidden Markov modeling of line spectrum pair frequencies

This paper proposes a spectral normalization approach in which the acoustical qualities of an input speech waveform are mapped onto that of a desired neutral voice. Such a method can be effective in reducing the impact of speaker variability such as accent, stress, and emotion for speech recognition. In the proposed method, the transformation is performed by modeling the temporal characteristics...

A novel technique for voice conversion based on style and content decomposition with bilinear models

This paper presents a novel technique for voice conversion by solving a two-factor task using bilinear models. The spectral content of the speech represented as line spectral frequencies is separated into so-called style and content parameterizations using a framework proposed in [1]. This formulation of the voice conversion problem in terms of style and content offers a flexible representation...

Publication date: 2010